ES3: A Demonstration of Transparent Provenance for Scientific Computation
نویسندگان
چکیده
The Earth System Science Server (ES3) is a software environment for data-intensive Earth science, with unique capabilities for automatically and transparently capturing and managing the provenance of arbitrary computations. Transparent acquisition avoids the scientist having to express their computations in specific languages or schemas for provenance to be available. ES3 models provenance as relationships between processes and their input and output files. These relationships are captured by monitoring read and write accesses at various levels in the science software and asynchronously converting them to time-ordered streams of provenance events which are stored in an XML database. An ES3 provenance query returns an XML serialization of a provenance graph, forward or backwards from a specified process or file. We demonstrate ES3 provenance by generating complex data products from Earth satellite imagery.
منابع مشابه
Automatic capture and reconstruction of computational provenance
The Earth System Science Server (ES3) project is developing a local infrastructure for managing Earth science data products derived from satellite remote sensing. By ‘local,’ we mean the infrastructure that a scientist uses to manage the creation and dissemination of her own data products, particularly those that are constantly incorporating corrections or improvements based on the scientist’s ...
متن کاملComputational provenance in hydrologic science: a snow mapping example.
Computational provenance--a record of the antecedents and processing history of digital information--is key to properly documenting computer-based scientific research. To support investigations in hydrologic science, we produce the daily fractional snow-covered area from NASA's moderate-resolution imaging spectroradiometer (MODIS). From the MODIS reflectance data in seven wavelengths, we estima...
متن کاملAutomatic Provenance Collection and Publishing in a Science Data Production Environment - Early Results
The Earth System Science Server (ES3) system transparently collects provenance information from executing code. Provenance information (ancestors or descendants) for any process or data granule may then be retrieved from a web service, in both textual and graphical formats. We have installed ES3 in a quasi-production environment, wherein multiple Earth satellite data streams are synthesized int...
متن کاملToward a Theory of Self-explaining Computation
Provenance techniques aim to increase the reliability of human judgments about data by making its origin and derivation process explicit. Originally motivated by the needs of scientific databases and scientific computation, provenance has also become a major issue for business and government data on the Web. However, so far provenance has been studied only in relatively restrictive settings: ty...
متن کاملA Provenance-Based Fault Tolerance Mechanism for Scientific Workflows
Capturing provenance information in scientific workflows is not only useful for determining data-dependencies, but also for a wide range of queries including fault tolerance and usage statistics. As collaborative scientific workflow environments provide users with reusable shared workflows, collection and usage of provenance data in a generic way that could serve multiple data and computational...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008